Dataset Shape Overview

This bar chart provides a simple overview of the dataset's structure by showing the total number of rows and columns.
We started with 80,199 rows and 37 columns.
After cleaning, we are left with 44,174 rows and 14 columns. Adding a price/m² column brings the total to 15 columns.
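
The shape bookkeeping described here can be sketched in a few lines of pandas. This is a minimal illustration on a tiny made-up frame, not the notebook's actual cleaning code: the column `unused_column` and the specific cleaning steps are assumptions, and only `habitablesurface` and `price_per_m2` are names taken from the feature list in this notebook.

```python
import pandas as pd

# Tiny illustrative frame; the real dataset is far larger.
raw = pd.DataFrame({
    "price": [250000, 320000, None, 410000],
    "habitablesurface": [100, 160, 90, 0],
    "unused_column": ["a", "b", "c", "d"],  # hypothetical column dropped during cleaning
})

shape_before = raw.shape

# Assumed cleaning steps: drop an unneeded column, drop rows with missing values,
# and drop rows with a non-positive surface (which would break the division below).
clean = raw.drop(columns=["unused_column"]).dropna()
clean = clean[clean["habitablesurface"] > 0].copy()

# Derived feature: price per square metre.
clean["price_per_m2"] = clean["price"] / clean["habitablesurface"]

shape_after = clean.shape
print(f"before: {shape_before}, after: {shape_after}")
```

The two `shape` tuples are what a bar chart of rows and columns before and after cleaning would be built from.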

Distribution of All Properties by Surface Area

This histogram shows the distribution of all properties based on their habitable surface in m².
It helps identify whether there are common property sizes, and shows any skewness in the data (e.g. many small apartments vs few large houses).
As you can see, this chart includes some extreme values that skew the visual impression and make the chart almost unusable.

Distribution of Surface Area, Without Outliers

This version of the previous chart removes the top 1% of properties by surface area, i.e. those above the 99th percentile.
The goal is to focus on the majority of properties without being distorted by a few very large ones.
This provides a clearer view of the typical distribution of property sizes.
As we can see from the graph, the majority of properties lie between 100 m² and 200 m², with the peak at 100 m².
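
A percentile cut like this can be sketched with pandas' `quantile`. The data below is made up for illustration (99 typical surfaces plus one extreme value), and keeping values at or below the cutoff is an assumption about how the boundary is handled.

```python
import pandas as pd

# Illustrative surfaces in m²: 99 typical values and one extreme outlier.
df = pd.DataFrame({"habitablesurface": list(range(50, 149)) + [5000]})

# Value below which 99% of the surfaces fall.
cutoff = df["habitablesurface"].quantile(0.99)

# Keep everything at or below the cutoff; this drops the extreme tail.
trimmed = df[df["habitablesurface"] <= cutoff]

print(len(df), len(trimmed), trimmed["habitablesurface"].max())
```

Because the cutoff is computed from the data itself, the same code adapts to any dataset without hard-coding a maximum surface.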

Property Surface Area by Subtype

This interactive Plotly histogram shows the surface distribution from the previous chart, but now split by property subtype (e.g. house, apartment, villa).
We're still filtering to the 99th percentile to remove extreme values.
We overlaid the subtypes, which lets us compare them within the same space while still seeing where they overlap.
We can now filter on subtype to see how the distribution differs among individual subtypes.

Price Correlation Heatmap

This heatmap shows the correlation of all features with each other and, most importantly, their correlation with the price.

All feature correlations with price:

- price_per_m2: 0.57
- habitablesurface: 0.51
- bedroomcount: 0.37
- hasswimmingpool: 0.29
- building_condition: 0.22
- hasterrace: 0.09
- hasgarden: 0.05
- haslift: 0.05
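
A ranking like the one above can be derived from a correlation matrix. This is a minimal sketch on made-up data, assuming Pearson correlation (the pandas default) and a target column named `price`. The numbers it prints are unrelated to the values listed above.

```python
import pandas as pd

# Illustrative data only; values are invented.
df = pd.DataFrame({
    "price": [200000, 260000, 310000, 450000, 520000],
    "habitablesurface": [80, 110, 120, 200, 210],
    "bedroomcount": [1, 2, 3, 3, 4],
    "hasgarden": [0, 1, 0, 1, 0],
})

# Full feature-by-feature correlation matrix (this is what the heatmap displays).
corr = df.select_dtypes("number").corr()

# Take the price column, drop price's correlation with itself, sort descending.
corr_with_price = corr["price"].drop("price").sort_values(ascending=False)

print(corr_with_price.round(2))
```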

Top 5 Heatmap

This heatmap shows the five features that correlate most strongly with price.

Top 5 features most correlated with price:

- price_per_m2: 0.57
- habitablesurface: 0.51
- bedroomcount: 0.37
- hasswimmingpool: 0.29
- building_condition: 0.22
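
Selecting the top five can be sketched with `nlargest`, using the correlation values reported earlier in this notebook. Ranking by absolute value is an assumption, made so that a strong negative correlation would also count. All values here are positive, so it does not change the result.

```python
import pandas as pd

# Correlations with price as reported in the full list earlier in this notebook.
corr_with_price = pd.Series({
    "price_per_m2": 0.57,
    "habitablesurface": 0.51,
    "bedroomcount": 0.37,
    "hasswimmingpool": 0.29,
    "building_condition": 0.22,
    "hasterrace": 0.09,
    "hasgarden": 0.05,
    "haslift": 0.05,
})

# Rank by strength of correlation and keep the five strongest features.
top5_features = corr_with_price.abs().nlargest(5).index.tolist()

print(top5_features)
```

The resulting names, together with `price`, can be used to slice the full correlation matrix down to the smaller one shown in the heatmap.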